Performance Evaluation of OpenMP Applications with Nested Parallelism

نویسندگان

  • Yoshizumi Tanaka
  • Kenjiro Taura
  • Mitsuhisa Sato
  • Akinori Yonezawa
چکیده

Many existing OpenMP systems do not su ciently implement nested parallelism. This is supposedly because nested parallelism is believed to require a signi cant implementation e ort, incur a large overhead, or lack applications. This paper demonstrates Omni/ST, a simple and e cient implementation of OpenMP nested parallelism using StackThreads/MP, which is a ne-grain thread library. Thanks to StackThreads/MP, OpenMP parallel constructs are simply mapped onto thread creation primitives of StackThreads/MP, yet they are e ciently managed with a xed number of threads in the underlying thread package (e.g., Pthreads). Experimental results on Sun Ultra Enterprise 10000 with up to 60 processors show that overhead imposed by nested parallelism is very small (1-3% in ve out of six applications, and 8% for the other), and there is a signi cant scalability bene t for applications with nested parallelism.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Support and Efficiency of Nested Parallelism in OpenMP Implementations

Nested parallelism has been a major feature of OpenMP since its very beginnings. As a programming style, it provides an elegant solution for a wide class of parallel applications, with the potential to achieve substantial utilization of the available computational resources, in situations where outer-loop parallelism simply can not. Notwithstanding its significance, nested parallelism support w...

متن کامل

Nested Parallelism in the OMPi OpenMP/C Compiler

This paper presents a new version of the OMPi OpenMP C compiler, enhanced by lightweight runtime support based on user-level multithreading. A large number of threads can be spawned for a parallel region and multiple levels of parallelism are supported efficiently, without introducing additional overheads to the OpenMP library. Management of nested parallelism is based on an adaptive distributi...

متن کامل

Runtime Adjustment of Parallel Nested Loops

OpenMP allows programmers to specify nested parallelism in parallel applications. In the case of scientific applications, parallel loops are the most important source of parallelism. In this paper we present an automatic mechanism to dynamically detect the best way to exploit the parallelism when having nested parallel loops. This mechanism is based on the number of threads, the problem size, a...

متن کامل

A Microbenchmark Study of OpenMP Overheads under Nested Parallelism

In this work we present a microbenchmark methodology for assessing the overheads associated with nested parallelism in OpenMP. Our techniques are based on extensions to the well known EPCC microbenchmark suite that allow measuring the overheads of OpenMP constructs when they are effected in inner levels of parallelism. The methodology is simple but powerful enough and has enabled us to gain int...

متن کامل

Portable Support and Exploitation of Nested Parallelism in OpenMP

In this paper, we present an alternative implementation of the NANOS OpenMP runtime library (NthLib) that targets portability and efficient support of multiple levels of parallelism. We have implemented the runtime libraries of available opensource OpenMP compilers on top of NthLib, reducing thus their overheads and providing them with inherent support for nested parallelism. In addition, we pr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000